AITopics | time-delay neural network

Large-scale, machine-learning-based raga identification remains a nontrivial problem in the computational study of Carnatic music. Each raga consists of many unique, intrinsic melodic patterns that can be used to distinguish it from other ragas. Identified ragas can then be used to cluster songs within the same raga and to find songs in closely related ragas. Here, the input audio is analyzed in several steps, including a discrete Fourier transform and triangular filtering that creates custom bins of possible notes; features are extracted from the presence or absence of particular notes. The backbone of the classification model combines 1D Convolutional Neural Networks (conventionally known as Time-Delay Neural Networks) with Long Short-Term Memory (LSTM) networks, a form of Recurrent Neural Network. In addition, to handle variations in shruti, a long-time attention-based mechanism will be implemented to determine relative changes in frequency rather than absolute differences. This provides a much more meaningful data point when training on audio clips in different shrutis. To evaluate the accuracy of the classifier, a dataset of 676 recordings is used. The songs are distributed across the list of ragas. The goal of this program is to effectively and efficiently label a much wider range of audio clips in more shrutis, in more ragas, and with more background noise.
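The front end described in this abstract (a discrete Fourier transform followed by triangular filters that bin energy by note) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 12 equal-tempered bins per octave, and the choice to centre the filters on frequencies measured relative to a known tonic are all assumptions made for illustration. Anchoring the filters to the tonic is one simple way to make the features depend on relative rather than absolute pitch.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the discrete Fourier transform for bins 0..N/2."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    return mags


def note_energies(mags, sample_rate, n_samples, tonic_hz, notes_per_octave=12):
    """Triangular filters centred on notes defined relative to the tonic.

    Assumption: equal-tempered spacing, one octave above the tonic.
    Filter i peaks at tonic_hz * 2**(i / notes_per_octave) and falls
    linearly to zero at the centres of its two neighbours.
    """
    energies = []
    for i in range(notes_per_octave):
        lo = tonic_hz * 2 ** ((i - 1) / notes_per_octave)
        centre = tonic_hz * 2 ** (i / notes_per_octave)
        hi = tonic_hz * 2 ** ((i + 1) / notes_per_octave)
        total = 0.0
        for k, mag in enumerate(mags):
            freq = k * sample_rate / n_samples
            if lo < freq <= centre:
                weight = (freq - lo) / (centre - lo)
            elif centre < freq < hi:
                weight = (hi - freq) / (hi - centre)
            else:
                weight = 0.0
            total += weight * mag
        energies.append(total)
    return energies


def dominant_note(samples, sample_rate, tonic_hz):
    """Index (in note steps above the tonic) of the strongest note bin."""
    mags = dft_magnitudes(samples)
    energies = note_energies(mags, sample_rate, len(samples), tonic_hz)
    return max(range(len(energies)), key=energies.__getitem__)
```

With this setup, a tone seven note steps above the tonic lands in bin 7 whether the tonic is 220 Hz or 261.63 Hz, which is the kind of shruti-independent feature the abstract argues for. A real system would use an FFT library and a tuning appropriate to Carnatic music rather than this direct, quadratic-time DFT.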

carnatic music, neural network, raga, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.13140/RG.2.2.17517.40164

2405.16

Country:

North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
North America > United States > New York > New York County > New York City (0.04)
(6 more...)

Genre: Research Report (0.64)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Representation and Induction of Finite State Machines using Time-Delay Neural Networks

Neural Information Processing Systems, Apr-6-2023, 18:13:19 GMT

This work investigates the representational and inductive capabilities of time-delay neural networks (TDNNs) in general, and of two subclasses of TDNN, those with delays only on the inputs (IDNN), and those which include delays on hidden units (HDNN). Both architectures are capable of representing the same class of languages, the definite memory machine (DMM) languages, but the delays on the hidden units in the HDNN help it outperform the IDNN on problems composed of repeated features over short time windows.

finite state machine, representation and induction, time-delay neural network, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Representation and Induction of Finite State Machines using Time-Delay Neural Networks

Clouse, Daniel S., Giles, C. Lee, Horne, Bill G., Cottrell, Garrison W.

Neural Information Processing Systems, Dec-31-1997

This work investigates the representational and inductive capabilities of time-delay neural networks (TDNNs) in general, and of two subclasses of TDNN, those with delays only on the inputs (IDNN), and those which include delays on hidden units (HDNN). Both architectures are capable of representing the same class of languages, the definite memory machine (DMM) languages, but the delays on the hidden units in the HDNN help it outperform the IDNN on problems composed of repeated features over short time windows.
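To make the idea of a definite memory machine language concrete, here is a small sketch, assuming the usual reading that membership can be decided from a fixed number of the most recent input symbols. The example language (binary strings whose last two symbols are both 1), the hand-picked weights, and the function name are hypothetical and not taken from the paper. The unit below sees only a tapped delay line of the input, which is the IDNN-style arrangement the abstract describes.

```python
def idnn_accepts(bits, weights=(1.0, 1.0), bias=-1.5):
    """Single threshold unit over a tapped delay line of the input.

    weights[k] multiplies the input symbol from k steps ago, so the
    decision depends only on the last len(weights) symbols. With the
    default (hypothetical) weights the unit fires only when the two most
    recent symbols are both 1.
    """
    depth = len(weights)
    # Pad with zeros so short strings still fill the delay line.
    window = ([0] * depth + list(bits))[-depth:]
    most_recent_first = window[::-1]
    activation = sum(w * x for w, x in zip(weights, most_recent_first)) + bias
    return activation > 0
```

Because the unit never looks further back than its delay line, any two strings that share the same final two symbols receive the same answer, for example `[1, 0, 1, 1]` and `[0, 0, 1, 1]`.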

idnn, representation and induction, tdnn, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.14)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > California > San Diego County > La Jolla (0.05)
(2 more...)

Genre: Research Report (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Representation and Induction of Finite State Machines using Time-Delay Neural Networks

Clouse, Daniel S., Giles, C. Lee, Horne, Bill G., Cottrell, Garrison W.

Neural Information Processing Systems, Dec-31-1997

This work investigates the representational and inductive capabilities of time-delay neural networks (TDNNs) in general, and of two subclasses of TDNN, those with delays only on the inputs (IDNN), and those which include delays on hidden units (HDNN). Both architectures are capable of representing the same class of languages, the definite memory machine (DMM) languages, but the delays on the hidden units in the HDNN help it outperform the IDNN on problems composed of repeated features over short time windows.

1 Introduction

In this paper we consider the representational and inductive capabilities of time-delay neural networks (TDNN) [Waibel et al., 1989] [Lang et al., 1990], also known as NNFIR [Wan, 1993]. A TDNN is a feed-forward network in which the set of inputs to any node i may include the output from previous layers not only in the current time step t, but from d earlier time steps as well. The activation function for node i at time t in such a network is given by equation 1 of the paper. TDNNs have been used in speech recognition [Waibel et al., 1989], and time series prediction [Wan, 1993]. In this paper we concentrate on the language induction problem.
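The description of a TDNN node (inputs taken from the current time step and from d earlier time steps) can be sketched as follows. This is an illustrative forward pass under stated assumptions, not a reproduction of the paper's equation 1: the function name, the weight layout, and the tanh squashing function are all choices made here for the example.

```python
import math


def tdnn_layer(inputs, weights, bias):
    """Forward pass of one time-delay unit over a sequence.

    inputs:  list over time of feature vectors (list of lists of floats).
    weights: weights[k][j] is the weight on feature j from k steps ago,
             for k = 0..d, where d = len(weights) - 1.
    Returns one activation per time step t >= d, computed as
        y(t) = tanh(bias + sum_k sum_j weights[k][j] * inputs[t - k][j])
    (tanh is an assumed activation function).
    """
    d = len(weights) - 1
    outputs = []
    for t in range(d, len(inputs)):
        total = bias
        for k in range(d + 1):
            for j, w in enumerate(weights[k]):
                total += w * inputs[t - k][j]
        outputs.append(math.tanh(total))
    return outputs
```

Since the same weights are applied at every time step, this is equivalent to a 1D convolution along the time axis followed by a nonlinearity. For instance, `tdnn_layer([[1.0], [2.0], [3.0], [4.0]], [[1.0], [-1.0]], 0.0)` computes the squashed first difference of the sequence.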

idnn, simulation, tdnn, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.05)
North America > United States > California > San Diego County > La Jolla (0.05)
(2 more...)

Genre: Research Report (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Connectionist Architectures for Multi-Speaker Phoneme Recognition

Hampshire II, John B., Waibel, Alex

Neural Information Processing Systems, Dec-31-1990

We present a number of Time-Delay Neural Network (TDNN) based architectures for multi-speaker phoneme recognition (/b,d,g/ task). We use speech of two females and four males to compare the performance of the various architectures against a baseline recognition rate of 95.9% for a single TDNN on the six-speaker /b,d,g/ task. This series of modular designs leads to a highly modular multi-network architecture capable of performing the six-speaker recognition task at the speaker-dependent rate of 98.4%. In addition to its high recognition rate, the so-called "Meta-Pi" architecture learns, without direct supervision, to recognize the speech of one particular male speaker using internal models of other male speakers exclusively.
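The abstract does not spell out how the multi-network architecture combines its modules. As a rough illustration only, the sketch below assumes a gated combination in which each speaker-specific module produces class scores and a separate set of non-negative gate values weights those scores. The function name, the normalisation, and the example numbers are hypothetical and should not be read as the Meta-Pi design itself.

```python
def gated_combination(module_outputs, gates):
    """Combine per-speaker module outputs using non-negative gate values.

    module_outputs[k][c] is the score module k assigns to class c.
    gates[k] is the (assumed) gate value for module k. Gates are
    normalised so that the result is a weighted average of the modules.
    """
    total = sum(gates)
    if total <= 0:
        raise ValueError("at least one gate must be positive")
    n_classes = len(module_outputs[0])
    return [
        sum(g * out[c] for g, out in zip(gates, module_outputs)) / total
        for c in range(n_classes)
    ]
```

Setting one module's gate to zero, as in `gated_combination(scores, [0.0, 1.0, 1.0])`, classifies using only the remaining modules, which loosely mirrors the abstract's observation that one speaker can be recognized through the models of other speakers.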

architecture, connectionist architecture, neural network, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > District of Columbia > Washington (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Connectionist Architectures for Multi-Speaker Phoneme Recognition

Hampshire II, John B., Waibel, Alex

Neural Information Processing Systems, Dec-31-1990

We present a number of Time-Delay Neural Network (TDNN) based architectures for multi-speaker phoneme recognition (/b,d,g/ task). We use speech of two females and four males to compare the performance of the various architectures against a baseline recognition rate of 95.9% for a single TDNN on the six-speaker /b,d,g/ task.

architecture, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.29)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Consonant Recognition by Modular Construction of Large Phonemic Time-Delay Neural Networks

Waibel, Alex

Neural Information Processing Systems, Dec-31-1989

Encouraged by these results, we wanted to explore how we might expand on these models to make them useful for the design of speech recognition systems. A problem that emerges as we attempt to apply neural network models to the full speech recognition problem is the problem of scaling. Simply extending neural networks to ever larger structures and retraining them as one monolithic net quickly exceeds the capabilities of the fastest and largest supercomputers. The search complexity of finding good solutions in a huge space of possible network configurations also soon assumes unmanageable proportions. Moreover, having to decide on all possible classes for recognition ahead of time, as well as collecting sufficient data to train such a large monolithic network, is impractical to say the least. In an effort to extend our models from small recognition tasks to large-scale speech recognition systems, we must therefore explore modularity and incremental learning as design strategies to break up a large learning task into smaller subtasks. Breaking up a large task into subtasks to be tackled by individual black boxes interconnected in ad hoc arrangements, on the other hand, would mean abandoning one of the most attractive aspects of connectionism: the ability to perform complex constraint satisfaction in a massively parallel and interconnected fashion, in view of an overall optimal performance goal.

consonant recognition, experiment, recognition, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Carnatic Raga Identification System using Rigorous Time-Delay Neural Network

Representation and Induction of Finite State Machines using Time-Delay Neural Networks

Connectionist Architectures for Multi-Speaker Phoneme Recognition

Consonant Recognition by Modular Construction of Large Phonemic Time-Delay Neural Networks